Abstract. We present a simple randomized polynomial time algorithm to approximate
the mixed discriminant of $n$ positive semidefinite $n \times n$ matrices within a
factor $2^{O(n)}$. Consequently, the algorithm allows us to approximate in randomized
polynomial time the permanent of a given $n \times n$ non-negative matrix within a factor
$2^{O(n)}$. When applied to approximating the permanent, the algorithm turns out to
be a simple modification of the well-known Godsil-Gutman estimator.



1. Introduction 



In this paper we address the question of how to approximate the permanent of a
non-negative matrix and, more generally, the mixed discriminant of positive
semidefinite matrices. Our main result is that a simple modification of the well-known
Godsil-Gutman estimator ([10], see also Chapter 8 of [19]) yields a randomized
polynomial time algorithm which, given an $n \times n$ non-negative matrix, approximates
its permanent within a $2^{O(n)}$ factor. It turns out that the ideas of the algorithm
and the proof become more transparent when generalized to mixed discriminants.
The first randomized polynomial time algorithm that approximates the permanent
within a $2^{O(n)}$ factor was suggested by the author in [5]. The algorithm described in
this paper has some advantages over the algorithm from [5]: it has a much
more transparent structure, it is much easier to implement, and it is easily
parallelizable. Besides, it sheds some additional light on the properties of the
Godsil-Gutman estimator, which was studied in several papers (see [9], [16] and
Chapter 8 of [19]).

(1.1) Permanent. Let $A = (a_{ij})$ be an $n \times n$ matrix and let $S_n$ be the symmetric
group, that is, the group of all permutations of the set $\{1, \ldots, n\}$. The number

$$\operatorname{per} A = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i\sigma(i)}$$

is called the permanent of $A$. If $A$ is a 0-1 matrix, then $\operatorname{per} A$ can be interpreted
as the number of perfect matchings in a bipartite graph $G$ on $2n$ vertices $v_1, \ldots, v_n$
and $u_1, \ldots, u_n$, where $(v_i, u_j)$ is an edge of $G$ if and only if $a_{ij} = 1$. To compute
the permanent of a given 0-1 matrix is a #P-complete problem and even to estimate
$\operatorname{per} A$ seems to be difficult. Polynomial time algorithms for computing $\operatorname{per} A$ are
known when $A$ has some special structure, for example, when $A$ is sparse [11] or
has small rank [4]. Polynomial time approximation schemes are known for dense
0-1 matrices [12], for "almost all" 0-1 matrices (see [12], [21] and [9]) and for some
special 0-1 matrices, such as those corresponding to lattice graphs (see [13] for a survey
on approximation algorithms). However, not much is known on how to approximate
in polynomial time the permanent of an arbitrary 0-1 matrix (see [15] for the fastest
known "mildly exponential" approximation scheme), let alone the permanent of an
arbitrary non-negative matrix.
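
The definition of the permanent can be evaluated directly for very small matrices. The following is a minimal sketch, not part of the paper: the function name `permanent` and the example matrix are illustrative assumptions, and the sum over all permutations takes $n!$ steps, so it is only useful as a reference value.

```python
from itertools import permutations

def permanent(a):
    """Permanent of a square matrix by summing over all n! permutations."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= a[i][sigma[i]]
        total += term
    return total

# Hypothetical bipartite graph on v1, v2, v3 and u1, u2, u3:
# entry (i, j) is 1 exactly when (v_i, u_j) is an edge.
A = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
print(permanent(A))  # number of perfect matchings of this graph
```

For this graph the two perfect matchings are the identity pairing and the cyclic pairing $v_1 u_2, v_2 u_3, v_3 u_1$.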

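Let $t_1, \ldots, t_n$ be real variables. Then the permanent $\operatorname{per} A$ can be expressed as
the coefficient of $t_1 \cdots t_n$ in the product of linear forms:

$$(1.1.1) \qquad \operatorname{per} A = \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \prod_{i=1}^{n} \sum_{j=1}^{n} a_{ij} t_j.$$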

(1.2) Mixed Discriminant. Let $Q_1, \ldots, Q_n$ be $n \times n$ symmetric matrices and let
$t_1, \ldots, t_n$ be real variables. Then $\det(t_1 Q_1 + \ldots + t_n Q_n)$ is a homogeneous polynomial
of degree $n$ in $t_1, \ldots, t_n$. The number

$$D(Q_1, \ldots, Q_n) = \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det(t_1 Q_1 + \ldots + t_n Q_n)$$

is called the mixed discriminant of $Q_1, \ldots, Q_n$. Sometimes the normalizing factor
$1/n!$ is used (cf. [18]). The mixed discriminant $D(Q_1, \ldots, Q_n)$ is a polynomial in the
entries of $Q_1, \ldots, Q_n$ with coefficients $-1$ and $1$.

The mixed discriminant can be considered as a generalization of the permanent.
Indeed, from (1.1.1) we deduce that for diagonal matrices $Q_1, \ldots, Q_n$

$$(1.2.1) \qquad D(Q_1, \ldots, Q_n) = \operatorname{per} A, \quad \text{where } Q_i = \operatorname{diag}\{a_{i1}, \ldots, a_{in}\} \text{ and } A = (a_{ij}).$$

Mixed discriminants were introduced by A.D. Aleksandrov in his proof of the
Aleksandrov-Fenchel inequality for mixed volumes ([2], see also [18]). The relation
between the mixed discriminant and the permanent was used in the proof of the Van
der Waerden conjecture for permanents of doubly stochastic matrices (see [6]).
The mixed discriminant is linear in each argument, that is,

$$D(Q_1, \ldots, Q_{i-1}, \alpha Q_i + \beta Q_i', Q_{i+1}, \ldots, Q_n)$$
$$= \alpha D(Q_1, \ldots, Q_{i-1}, Q_i, Q_{i+1}, \ldots, Q_n) + \beta D(Q_1, \ldots, Q_{i-1}, Q_i', Q_{i+1}, \ldots, Q_n),$$

and $D(Q_1, \ldots, Q_n) \geq 0$ provided $Q_1, \ldots, Q_n$ are positive semidefinite (see [18]).
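
The definition of the mixed discriminant and its relation (1.2.1) to the permanent can be checked on a toy instance. The sketch below is an illustration under stated assumptions, not the paper's method: it uses the fact that, expanding $\det(t_1 Q_1 + \ldots + t_n Q_n)$ column by column, the coefficient of $t_1 \cdots t_n$ is the sum over permutations $\sigma$ of $\det M_\sigma$, where column $j$ of $M_\sigma$ is column $j$ of $Q_{\sigma(j)}$. The function names are hypothetical and the running time is exponential in $n$.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation (given as a tuple), by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(m):
    """Exact determinant by the Leibniz formula (small n only)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

def permanent(a):
    """Permanent by the defining sum over permutations."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def mixed_discriminant(qs):
    """Coefficient of t_1...t_n in det(t_1 Q_1 + ... + t_n Q_n)."""
    n = len(qs)
    total = 0
    for sigma in permutations(range(n)):
        # Column j of this matrix is column j of Q_{sigma(j)}.
        m = [[qs[sigma[j]][i][j] for j in range(n)] for i in range(n)]
        total += det(m)
    return total

A = [[1, 2], [3, 4]]
n = len(A)
# Q_i = diag(a_i1, ..., a_in), as in (1.2.1).
Q = [[[A[i][j] if j == k else 0 for k in range(n)] for j in range(n)] for i in range(n)]
print(mixed_discriminant(Q), permanent(A))
```

With all arguments equal to the same matrix $Q$ the formula gives $n! \det Q$, which is a quick sanity check on the implementation.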

The paper is organized as follows. In Section 2, we describe the algorithms for
computing mixed discriminants and permanents, state the main results, and discuss
them. Section 3 addresses the behavior of quadratic forms on random vectors drawn
from the Gaussian distribution in $\mathbb{R}^n$, our main tool for analyzing the algorithms.
Section 4 contains proofs of the main results of the paper. In Section 5, we present
an application of the mixed discriminant to counting problems and also make some
remarks on "binary versions" of our algorithms.

2. Main Results 

Our algorithms use two standard procedures from linear algebra: computing the
determinant of a matrix and computing a decomposition $Q = TT^*$, where $Q$ is a
positive semidefinite matrix and $T^*$ is the transpose of $T$. One can compute the
determinant of a given $n \times n$ matrix using $O(n^3)$ arithmetic operations (see, for
example, Chapter 2, Section 7 of [8]). For a positive semidefinite $n \times n$ matrix $Q$
one can compute a decomposition $Q = TT^*$, where $T$ is a lower triangular matrix,
by using $O(n^3)$ arithmetic operations and taking the square root of a non-negative
number $n$ times (see, for example, Chapter 2, Section 10 of [8]). The "random" part
of the algorithms consists of sampling $n^2$ random variables independently from the
standard Gaussian distribution in $\mathbb{R}$ with the density

$$\psi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.$$

Various ways to simulate this distribution from the uniform distribution on the
interval $[0,1]$ are described in Section 3.4.1 (C) of [17]. In particular, this allows us
to sample vectors $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ from the standard $n$-dimensional Gaussian
distribution in $\mathbb{R}^n$ with the density

$$\psi_n(x) = (2\pi)^{-n/2} \exp\{-\|x\|^2/2\}, \quad \text{where } \|x\|^2 = x_1^2 + \ldots + x_n^2,$$

by sampling the coordinates $x_1, \ldots, x_n$ independently from the one-dimensional
Gaussian distribution. We will write $\psi(x)$ instead of $\psi_n(x)$ if the choice of the ambient space
$\mathbb{R}^n$ is clear from the context.

To simplify our analysis, we assume that we operate with real numbers and that we 
can perform arithmetic operations, take square root and sample from the Gaussian 
distribution exactly. In Section 5.1 we briefly discuss how to adjust our algorithms 
to the "binary model" of computation, that is, if we allow only integers (rational 
numbers) and arithmetic operations on them. As a general source on algorithms and 
complexity we use [1]. 

Now we describe our main algorithm. 

(2.1) Algorithm for computing the mixed discriminant.
Input: Positive semidefinite $n \times n$ matrices $Q_1, \ldots, Q_n$.

Output: A number $\alpha$ approximating the mixed discriminant $D(Q_1, \ldots, Q_n)$.

Algorithm: For $i = 1, \ldots, n$ compute a decomposition $Q_i = T_i T_i^*$. Sample
independently $n$ vectors $u_1, \ldots, u_n$ at random from the standard Gaussian distribution
in $\mathbb{R}^n$ with the density $\psi(x) = (2\pi)^{-n/2} \exp\{-\|x\|^2/2\}$. Compute

$$\alpha = \left(\det[T_1 u_1, \ldots, T_n u_n]\right)^2,$$

the squared determinant of the matrix with the columns $T_1 u_1, \ldots, T_n u_n$.
Output $\alpha$.
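
The steps of Algorithm 2.1 can be sketched in a few lines of pure Python. This is a floating-point illustration under assumptions, not the exact real-number model used in the analysis: `psd_factor` is a hypothetical Cholesky-style routine that sets a column to zero when its pivot vanishes, `det` uses Gaussian elimination, and the test matrices and seed are arbitrary.

```python
import math
import random

def det(m):
    """Determinant by Gaussian elimination with partial pivoting, O(n^3)."""
    a = [[float(x) for x in row] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result

def psd_factor(q, tol=1e-12):
    """Lower triangular T with T T^t = Q, for a positive semidefinite Q."""
    n = len(q)
    t = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = q[j][j] - sum(t[j][k] ** 2 for k in range(j))
        t[j][j] = math.sqrt(d) if d > tol else 0.0
        if t[j][j] > 0.0:
            for i in range(j + 1, n):
                s = q[i][j] - sum(t[i][k] * t[j][k] for k in range(j))
                t[i][j] = s / t[j][j]
    return t

def estimate_mixed_discriminant(factors, rng):
    """One run of the estimator: det[T_1 u_1, ..., T_n u_n] squared."""
    n = len(factors)
    columns = []
    for t in factors:
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        columns.append([sum(t[i][k] * u[k] for k in range(n)) for i in range(n)])
    matrix = [[columns[j][i] for j in range(n)] for i in range(n)]
    return det(matrix) ** 2

rng = random.Random(0)
qs = [[[2.0, 1.0], [1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]]
factors = [psd_factor(q) for q in qs]
samples = [estimate_mixed_discriminant(factors, rng) for _ in range(20000)]
mean_alpha = sum(samples) / len(samples)
print(mean_alpha)  # empirical mean; here D(Q_1, Q_2) = trace(Q_1) = 4
```

A single run is one sample of the output; the averaging at the end is only there to make the unbiasedness visible on a toy instance.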

(2.2) Theorem.

(2.2.1) The expectation of $\alpha$ is the mixed discriminant $D(Q_1, \ldots, Q_n)$;

(2.2.2) For any $C > 1$ the probability that

$$\alpha \geq C \cdot D(Q_1, \ldots, Q_n)$$

does not exceed $C^{-1}$;

(2.2.3) Let

$$c_0 = \exp\left\{ \frac{4}{\sqrt{2\pi}} \int_0^{+\infty} (\ln t)\, e^{-t^2/2}\, dt \right\} \approx 0.2807297419.$$

Then for any $1 > \epsilon > 0$ the probability that

$$\alpha \leq (\epsilon c_0)^n D(Q_1, \ldots, Q_n)$$

does not exceed $\dfrac{8}{n \ln^2 \epsilon}$.

The algorithm becomes especially simple when we apply it to compute the mixed 
discriminant of diagonal matrices, that is the permanent of a non-negative matrix. 

(2.3) Algorithm for computing the permanent of a non-negative matrix.

Input: Non-negative $n \times n$ matrix $A$.

Output: A number $\alpha$ approximating the permanent $\operatorname{per} A$.

Algorithm: Sample independently $n^2$ numbers $u_{ij}$, $i, j = 1, \ldots, n$, at random
from the standard Gaussian distribution in $\mathbb{R}$ with the density
$\psi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.
Compute the $n \times n$ matrix $B = (b_{ij})$, where $b_{ij} = u_{ij} \sqrt{a_{ij}}$. Compute $\alpha = (\det B)^2$.
Output $\alpha$.
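
Algorithm 2.3 is short enough to write out in full. The following is a floating-point sketch under assumptions (hypothetical function names, an arbitrary $2 \times 2$ test matrix, a fixed seed); it averages many runs only to illustrate that the output is an unbiased estimate of the permanent.

```python
import math
import random

def det(m):
    """Determinant by Gaussian elimination with partial pivoting, O(n^3)."""
    a = [[float(x) for x in row] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result

def estimate_permanent(a, rng):
    """One run: B = (u_ij * sqrt(a_ij)) with standard Gaussian u_ij; return (det B)^2."""
    n = len(a)
    b = [[rng.gauss(0.0, 1.0) * math.sqrt(a[i][j]) for j in range(n)] for i in range(n)]
    return det(b) ** 2

rng = random.Random(0)
A = [[1.0, 2.0], [3.0, 4.0]]  # per A = 1*4 + 2*3 = 10
samples = [estimate_permanent(A, rng) for _ in range(20000)]
mean_alpha = sum(samples) / len(samples)
print(mean_alpha)
```

A matrix with a zero row has permanent 0, and in that case every run returns exactly 0 because $B$ has a zero row as well.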

(2.4) Theorem.

(2.4.1) The expectation of $\alpha$ is the permanent $\operatorname{per} A$;

(2.4.2) For any $C > 1$ the probability that

$$\alpha \geq C \cdot \operatorname{per} A$$

does not exceed $C^{-1}$;

(2.4.3) Let

$$c_0 = \exp\left\{ \frac{4}{\sqrt{2\pi}} \int_0^{+\infty} (\ln t)\, e^{-t^2/2}\, dt \right\} \approx 0.2807297419.$$

Then for any $1 > \epsilon > 0$ the probability that

$$\alpha \leq (\epsilon c_0)^n \operatorname{per} A$$

does not exceed $\dfrac{8}{n \ln^2 \epsilon}$.

As implied by (1.2.1), Theorem 2.4 is a straightforward corollary of Theorem 2.2.
From (2.2.2) and (2.2.3) it follows that for all sufficiently large $n$, the output $\alpha$ of
Algorithm 2.1 satisfies the inequalities

$$(0.28)^n D(Q_1, \ldots, Q_n) \leq \alpha \leq 3 D(Q_1, \ldots, Q_n)$$

with probability at least 0.6. We can make the probability as high as $1 - \epsilon$ for any $\epsilon > 0$
by running the algorithm independently $O(\ln(\epsilon^{-1}))$ times and taking the median of
the computed $\alpha$'s (cf. [14]). Hence we get a randomized polynomial time algorithm
approximating the mixed discriminant of positive semidefinite matrices (and hence
the permanent of a non-negative matrix) within a simply exponential factor $2^{O(n)}$.
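
The median-of-runs amplification described above can be sketched as follows. This is an illustration under assumptions: the number of runs, the test matrix and the seed are arbitrary, the helper names are hypothetical, and the two-sided bound in the text is stated only for sufficiently large $n$, so a tiny example says nothing about the constants.

```python
import math
import random
import statistics

def det(m):
    """Determinant by Gaussian elimination with partial pivoting, O(n^3)."""
    a = [[float(x) for x in row] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result

def estimate_permanent(a, rng):
    """One run of the Gaussian estimator for the permanent."""
    n = len(a)
    b = [[rng.gauss(0.0, 1.0) * math.sqrt(a[i][j]) for j in range(n)] for i in range(n)]
    return det(b) ** 2

def median_of_runs(a, runs, rng):
    """Median of independent runs; the number of runs plays the role of O(ln(1/eps))."""
    return statistics.median(estimate_permanent(a, rng) for _ in range(runs))

rng = random.Random(0)
A = [[1.0, 2.0], [3.0, 4.0]]  # per A = 10
print(median_of_runs(A, 101, rng))
```

The median is used rather than the mean because it is insensitive to the rare very large outputs, at the price of being a biased estimate (typically smaller than the permanent).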

(2.5) Relation to the Godsil-Gutman estimator. It is seen immediately that
Algorithm 2.3 is a modification of the Godsil-Gutman estimator (see [10] and
Chapter 8 of [19]). Indeed, in the Godsil-Gutman estimator we sample $u_{ij}$ from the binary
distribution:

$$u_{ij} = \begin{cases} 1 & \text{with probability } 1/2, \\ -1 & \text{with probability } 1/2. \end{cases}$$

Furthermore, parts (2.4.1) and (2.4.2) of Theorem 2.4 remain true as long as we
sample $u_{ij}$ independently from some distribution with expectation 0 and variance
1. However, (2.4.3) is not true for the binary distribution. Indeed, let $A = (a_{ij})$ be
the following $n \times n$ matrix:

$$a_{ij} = \begin{cases} 1 & \text{if } i = j \text{ or } \{i, j\} = \{1, 2\}, \\ 0 & \text{elsewhere.} \end{cases}$$

Then $\operatorname{per} A = 2$. If $u_{ij} \in \{-1, 1\}$ are sampled from the binary distribution, then
$\alpha = (u_{11} u_{22} - u_{12} u_{21})^2$. So, we get that $\alpha = 0$ with probability 1/2 and $\alpha = 4$
with probability 1/2. Apparently, sampling from continuous distributions allows us
to approximate better.
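
The counterexample above can be verified by exhaustive enumeration. The sketch below takes $n = 3$ (an arbitrary choice) and runs over all sign patterns on the nonzero entries of the matrix; the variable names are illustrative.

```python
from collections import Counter
from itertools import permutations, product

def sign(p):
    """Sign of a permutation (given as a tuple), by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(m):
    """Exact determinant by the Leibniz formula (small n only)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

# a_ij = 1 if i = j or {i, j} = {1, 2}, and 0 elsewhere; here n = 3.
A = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
support = [(i, j) for i in range(3) for j in range(3) if A[i][j]]

outcomes = Counter()
for signs in product((-1, 1), repeat=len(support)):
    b = [[0] * 3 for _ in range(3)]
    for (i, j), s in zip(support, signs):
        b[i][j] = s  # sqrt(a_ij) = 1 on the support
    outcomes[det(b) ** 2] += 1

print(dict(outcomes))
```

Half of the 32 sign patterns give output 0 and half give 4, so the binary estimator returns 0 with probability 1/2 even though the permanent is 2.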

Another (discrete) distribution for the Godsil-Gutman estimator was studied in 
[16] and [9]. It was shown that for a "typical" 0-1 matrix we get a polynomial 
approximation scheme for the permanent [9], whereas for a "worst case" 0-1 matrix 
it allows us to construct an approximation scheme [16] whose complexity, while still 
exponential, is significantly better than the complexity of known exact methods (cf. 
Chapter 7 of [20]). 



(2.6) How well can we approximate the permanent in polynomial time?

Suppose we have a polynomial time (probabilistic or deterministic) algorithm that
for any given $n \times n$ (non-negative or 0-1) matrix $A$ computes a number $\alpha$ such that
$\phi(n) \operatorname{per} A \leq \alpha \leq \operatorname{per} A$. What sort of function might $\phi(n)$ be? For an $n \times n$ matrix
$A$ and $k > 0$ let us construct the $nk \times nk$ block-diagonal matrix $A \otimes I_k$, having $k$
diagonal copies of $A$. We observe that $A \otimes I_k$ is non-negative if $A$ is non-negative and
$A \otimes I_k$ is 0-1 if $A$ is 0-1, and that $\operatorname{per} A = (\operatorname{per}(A \otimes I_k))^{1/k}$. Applying our algorithm to
$A \otimes I_k$ and taking the $k$-th root, we get an approximation of $\operatorname{per} A$ within a factor $\phi^{1/k}(nk)$.
So, if we suppose that $\phi$ is the best possible, we may assume that $\phi(n) \geq \phi^{1/k}(nk)$.
There are a few reasonable choices for such functions $\phi$.

a) $\phi(n) = 1$. This doesn't look likely, given that the problem is #P-hard.

b) For $\epsilon > 0$ one can choose $\phi_\epsilon(n) = 1 - \epsilon$ and the algorithm is polynomial in $\epsilon^{-1}$.
In the author's opinion this conjecture is overly optimistic, especially for arbitrary
non-negative matrices.

c) For $\epsilon > 0$ one can choose $\phi_\epsilon(n) = (1 - \epsilon)^n$, but the algorithm is not polynomial
in $\epsilon^{-1}$. This type of approximation was conjectured by V.D. Milman.

d) $\phi(n) = c^n$ for some fixed constant $c$. This is the type of bound achieved by
Algorithm 2.3.

We note that functions $\phi(n)$ decreasing faster than a simply exponential function
of $n$ (for example, $\phi(n) = 1/n!$) are not interesting since they are beaten by $c^n$ of
Algorithm 2.3. The author does not know if the constant $c_0 \approx 0.28$ from Theorem 2.4
can be improved.
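
The block-diagonal identity used in this subsection, that the permanent of $k$ diagonal copies of $A$ equals $(\operatorname{per} A)^k$, is easy to confirm on a small case. The sketch below is only a check of that identity; `block_diagonal` and `permanent` are hypothetical helper names and the brute-force permanent limits it to tiny matrices.

```python
from itertools import permutations

def permanent(a):
    """Permanent by the defining sum over permutations (small n only)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def block_diagonal(a, k):
    """The nk x nk matrix with k diagonal copies of the n x n matrix a."""
    n = len(a)
    m = [[0] * (n * k) for _ in range(n * k)]
    for block in range(k):
        for i in range(n):
            for j in range(n):
                m[block * n + i][block * n + j] = a[i][j]
    return m

A = [[1, 2], [3, 4]]  # per A = 10
for k in (1, 2, 3):
    print(k, permanent(block_diagonal(A, k)), permanent(A) ** k)
```

Taking the $k$-th root of an estimate for the large matrix is what turns a factor $\phi(nk)$ into $\phi^{1/k}(nk)$ for the original one.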

3. The Gaussian Measure and Quadratic Forms 

(3.1) Notation. Let us fix the standard orthonormal basis in $\mathbb{R}^n$. Thus we can
identify $n \times n$ matrices with linear operators on $\mathbb{R}^n$.
Let $\langle \cdot, \cdot \rangle$ be the standard inner product on $\mathbb{R}^n$, that is,

$$\langle x, y \rangle = x_1 y_1 + \ldots + x_n y_n, \quad \text{where } x = (x_1, \ldots, x_n) \text{ and } y = (y_1, \ldots, y_n).$$

We denote the corresponding Euclidean norm by $\| \cdot \|$.

Let $\mu$ be the standard Gaussian measure on $\mathbb{R}^n$ with the density

$$\psi(x) = (2\pi)^{-n/2} \exp\{-\|x\|^2/2\}.$$

For a $\mu$-integrable function $f : \mathbb{R}^n \to \mathbb{R}$ we define its expectation

$$\mathbf{E}(f) = \int_{\mathbb{R}^n} f \, d\mu = \int_{\mathbb{R}^n} f(x) \psi(x) \, dx.$$

Furthermore, if $F : \mathbb{R}^n \to \mathbb{R}^m$, $F(x) = (f_1(x), \ldots, f_m(x))$, we let
$\mathbf{E}(F) = (\mathbf{E}(f_1), \ldots, \mathbf{E}(f_m)) \in \mathbb{R}^m$.
If $x \in \mathbb{R}^n$ is a vector then we denote by $x \otimes x$ the $n \times n$ matrix whose $(i, j)$-th
entry is $x_i \cdot x_j$. So, $x \otimes x$ is a positive semidefinite matrix of rank 1, provided $x \neq 0$.
Let $Q$ be a symmetric $n \times n$ matrix and $q(x) = \langle x, Qx \rangle$ be the corresponding
quadratic form. We define the trace of $q$:

$$\operatorname{tr}(q) = \sum_{i=1}^{n} q_{ii}, \quad \text{where } Q = (q_{ij}).$$

We start with a simple lemma.

(3.2) Lemma.

(3.2.1) Let $q : \mathbb{R}^n \to \mathbb{R}$ be a quadratic form. Then

$$\mathbf{E}(q) = \operatorname{tr}(q).$$

(3.2.2) Let us fix an $n \times n$ matrix $T$. Then

$$\mathbf{E}(Tu \otimes Tu) = TT^*,$$

where $u \in \mathbb{R}^n$ is sampled from the standard Gaussian distribution $\mu$.

Proof. Since $\mathbf{E}(x_i^2) = 1$ for $i = 1, \ldots, n$ and $\mathbf{E}(x_i x_j) = 0$ for $i \neq j$, (3.2.1) follows.
Therefore, $\mathbf{E}(u \otimes u) = I$, the identity matrix, and hence

$$\mathbf{E}(Tu \otimes Tu) = \mathbf{E}\bigl(T(u \otimes u)T^*\bigr) = T\,\mathbf{E}(u \otimes u)\,T^* = TT^*,$$

and (3.2.2) follows. □

Now we prove the main result of this section.

(3.3) Theorem. Let $q : \mathbb{R}^n \to \mathbb{R}$ be a positive semidefinite quadratic form such
that $\mathbf{E}(q) = 1$. Let

$$C_0 = \frac{4}{\sqrt{2\pi}} \int_0^{+\infty} (\ln t)\, e^{-t^2/2}\, dt \approx -1.270362845.$$

Then

$$(3.3.1) \qquad C_0 \leq \mathbf{E}(\ln q) \leq 0$$

and

$$(3.3.2) \qquad 0 \leq \mathbf{E}(\ln^2 q) \leq 8.$$

Proof. Since $\ln$ is a concave function, we have $\mathbf{E}(\ln q) \leq \ln(\mathbf{E}(q)) = 0$. Let us
decompose $q$ into a non-negative linear combination $q = \lambda_1 q_1 + \ldots + \lambda_n q_n$ of positive
semidefinite forms $q_i$ of rank 1. We can scale $q_i$ so that $\mathbf{E}(q_i) = 1$ for $i = 1, \ldots, n$ and
then we have $\lambda_1 + \ldots + \lambda_n = 1$. In fact, one can choose $\lambda_i$ to be the eigenvalues of $q$
and $q_i = \langle x, u_i \rangle^2$, where $u_i$ is the corresponding unit eigenvector. Since $\ln$ is a concave
function we have $\ln(\lambda_1 q_1 + \ldots + \lambda_n q_n) \geq \lambda_1 \ln q_1 + \ldots + \lambda_n \ln q_n$. Furthermore, if $q_i$
is a positive semidefinite form of rank 1 such that $\mathbf{E}(q_i) = 1$, then by an orthogonal
transformation of the coordinates it can be brought into the form $q_i(x) = x_1^2$ (see
(3.2.1)). Therefore, $\mathbf{E}(\ln q_i) = \mathbf{E}(\ln x_1^2) = C_0$ and

$$\mathbf{E}(\ln q) \geq \lambda_1 \mathbf{E}(\ln q_1) + \ldots + \lambda_n \mathbf{E}(\ln q_n) = (\lambda_1 + \ldots + \lambda_n) C_0 = C_0,$$

so (3.3.1) is proven (we note that this reasoning proves that $\mathbf{E}(\ln q)$ is well-defined).
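Let $X = \{x \in \mathbb{R}^n : q(x) \leq 1\}$ and $Y = \mathbb{R}^n \setminus X$. Then

$$\mathbf{E}(\ln^2 q) = \int_X \psi(x) \ln^2 q(x) \, dx + \int_Y \psi(x) \ln^2 q(x) \, dx.$$

Let us estimate the first integral. Decomposing $q = \lambda_1 q_1 + \ldots + \lambda_n q_n$ as above, we
get $\ln q \geq \lambda_1 \ln q_1 + \ldots + \lambda_n \ln q_n$. Since $\ln q(x) \leq 0$ for $x \in X$, we get that

$$\ln^2 q(x) \leq \sum_{i,j=1}^{n} \lambda_i \lambda_j (\ln q_i(x)) (\ln q_j(x))$$

for $x \in X$. Therefore,

$$\int_X \psi(x) \ln^2 q(x) \, dx \leq \sum_{i,j=1}^{n} \lambda_i \lambda_j \int_X \psi(x) (\ln q_i(x)) (\ln q_j(x)) \, dx$$
$$\leq \sum_{i,j=1}^{n} \lambda_i \lambda_j \left( \int_X \psi(x) \ln^2 q_i(x) \, dx \right)^{1/2} \left( \int_X \psi(x) \ln^2 q_j(x) \, dx \right)^{1/2}$$

(we applied the Cauchy-Schwarz inequality)

$$\leq \sum_{i,j=1}^{n} \lambda_i \lambda_j \bigl(\mathbf{E}(\ln^2 q_i)\bigr)^{1/2} \bigl(\mathbf{E}(\ln^2 q_j)\bigr)^{1/2}.$$

Now, as in the proof of (3.3.1) we have

$$\mathbf{E}(\ln^2 q_i) = \mathbf{E}(\ln^2 x_1^2) = \frac{8}{\sqrt{2\pi}} \int_0^{+\infty} (\ln^2 t)\, e^{-t^2/2}\, dt \approx 6.548623960 < 7.$$

Summarizing, we get

$$\int_X \psi(x) \ln^2 q(x) \, dx \leq \sum_{i,j=1}^{n} \lambda_i \lambda_j \bigl(\mathbf{E}(\ln^2 q_i)\bigr)^{1/2} \bigl(\mathbf{E}(\ln^2 q_j)\bigr)^{1/2} \leq 7 \sum_{i,j=1}^{n} \lambda_i \lambda_j = 7.$$

Since $0 \leq \ln t \leq \sqrt{t}$ for $t \geq 1$ we have

$$\int_Y \psi(x) \ln^2 q(x) \, dx \leq \int_Y q(x) \psi(x) \, dx \leq \mathbf{E}(q) = 1.$$

Therefore, $\mathbf{E}(\ln^2 q) \leq 7 + 1 = 8$ and (3.3.2) is proven. □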
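
The numerical constants in Theorem 3.3 and in Theorem 2.2 can be cross-checked independently. The sketch below relies on standard closed forms that are not stated in the text and are used here as an assumption: for a standard Gaussian $x$, the mean of $\ln x^2$ is $-\gamma - \ln 2$ (with $\gamma$ the Euler-Mascheroni constant) and its variance is $\pi^2/2$. A crude midpoint rule for the defining integral is included as a second check; the step size and the cutoff are arbitrary.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

# Closed forms for ln of a squared standard Gaussian (assumed, not from the text).
C0 = -EULER_GAMMA - math.log(2.0)               # expectation of ln x^2
c0 = math.exp(C0)                               # the constant of (2.2.3)
second_moment = math.pi ** 2 / 2.0 + C0 ** 2    # expectation of (ln x^2)^2

def integral_C0(step=1e-4, upper=12.0):
    """Midpoint rule for (4 / sqrt(2 pi)) * integral over (0, upper) of ln(t) exp(-t^2 / 2)."""
    steps = int(upper / step)
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * step
        total += math.log(t) * math.exp(-t * t / 2.0)
    return 4.0 / math.sqrt(2.0 * math.pi) * total * step

print(C0, c0, second_moment, integral_C0())
```

The printed values can be compared with the constants quoted in the theorems; the midpoint rule agrees only to a few digits because of the logarithmic singularity at 0.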
Finally, we will need the following simple result.
(3.4) Lemma. Let $u_1, \ldots, u_n$ be vectors from $\mathbb{R}^n$. Then

$$D(u_1 \otimes u_1, \ldots, u_n \otimes u_n) = \left(\det[u_1, \ldots, u_n]\right)^2,$$

the squared determinant of the matrix with the columns $u_1, \ldots, u_n$.

Proof. Let $e_1, \ldots, e_n$ be the standard orthonormal basis of $\mathbb{R}^n$. Let $G$ be the operator
such that $G(e_i) = u_i$ for $i = 1, \ldots, n$. Then $u_i \otimes u_i = G(e_i \otimes e_i)G^*$ and from the
definition of Section 1.2 we get

$$D(u_1 \otimes u_1, \ldots, u_n \otimes u_n) = \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det(t_1 u_1 \otimes u_1 + \ldots + t_n u_n \otimes u_n)$$
$$= \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det\bigl(G (t_1 e_1 \otimes e_1 + \ldots + t_n e_n \otimes e_n) G^*\bigr)$$
$$= \det(GG^*) \frac{\partial^n}{\partial t_1 \cdots \partial t_n} \det(t_1 e_1 \otimes e_1 + \ldots + t_n e_n \otimes e_n) = (\det G)^2.$$

□
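
Lemma 3.4 can be checked exactly on integer vectors. The sketch below is an illustration, not part of the proof: it evaluates the mixed discriminant as the coefficient of $t_1 \cdots t_n$ via a column expansion of the determinant (an elementary reformulation assumed here), and the three test vectors are arbitrary.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation (given as a tuple), by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(m):
    """Exact determinant by the Leibniz formula (small n only)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

def mixed_discriminant(qs):
    """Coefficient of t_1...t_n in det(t_1 Q_1 + ... + t_n Q_n)."""
    n = len(qs)
    total = 0
    for sigma in permutations(range(n)):
        # Column j of this matrix is column j of Q_{sigma(j)}.
        m = [[qs[sigma[j]][i][j] for j in range(n)] for i in range(n)]
        total += det(m)
    return total

def outer(u):
    """The rank-one matrix u (x) u with entries u_i * u_j."""
    return [[x * y for y in u] for x in u]

vectors = [(1, 2, 0), (0, 1, 1), (1, 0, 3)]
lhs = mixed_discriminant([outer(u) for u in vectors])
columns = [[vectors[j][i] for j in range(3)] for i in range(3)]  # u_j as column j
rhs = det(columns) ** 2
print(lhs, rhs)
```

Using integers keeps both sides exact, so the comparison is an equality rather than a floating-point tolerance check.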



4. Proof of the Main Results 



(4.1) Conditional expectations. In this subsection, we summarize some general
results on measures and integration, which we exploit later. As a general source we
use [3].

Suppose that we have $k$ copies of the Euclidean space $\mathbb{R}^n$, each with the standard
Gaussian probability measure $\mu$. We will consider functions $f : \mathbb{R}^n \times \ldots \times \mathbb{R}^n \to \mathbb{R}$
that are defined almost everywhere and integrable with respect to the measure
$\nu_k = \mu \times \ldots \times \mu$. Let $f(u_1, \ldots, u_k)$ be such a function. Then for almost all $(k-1)$-tuples
$(u_1, \ldots, u_{k-1})$ with $u_i \in \mathbb{R}^n$, the function $f(u_1, \ldots, u_{k-1}, \cdot)$ is integrable (Fubini's
Theorem) and we can define the conditional expectation $g(u_1, \ldots, u_{k-1}) = \mathbf{E}_k(f)$ by
letting

$$\mathbf{E}_k(f)(u_1, \ldots, u_{k-1}) = \int_{\mathbb{R}^n} f(u_1, \ldots, u_{k-1}, u_k) \psi(u_k) \, du_k.$$

Fubini's theorem implies that

$$\mathbf{E}(f) = \mathbf{E}_1 \ldots \mathbf{E}_k(f),$$

where $\mathbf{E}$ is the expectation with respect to the product measure $\nu_k$. Tonelli's Theorem
states that if $f$ is $\nu_k$-measurable and non-negative almost surely with respect to $\nu_k$
then $f$ is $\nu_k$-integrable, provided $\mathbf{E}_1 \ldots \mathbf{E}_k(f) < +\infty$.

If $f(u_1, \ldots, u_i)$ is a function of $i < k$ arguments, we may formally extend it to
$\mathbb{R}^n \times \ldots \times \mathbb{R}^n$ ($k$ times) by letting $f(u_1, \ldots, u_k) = f(u_1, \ldots, u_i)$. The distribution of
values of $f(u_1, \ldots, u_i)$ with respect to $\nu_i$ is the same as the distribution of values of
$f(u_1, \ldots, u_k)$ with respect to $\nu_k$. In particular, if $f(u_1, \ldots, u_i)$ is $\nu_i$-integrable then
$f(u_1, \ldots, u_k)$ is $\nu_k$-integrable.
We note the following useful facts:

(4.1.1) The linear operator $\mathbf{E}_k$ is monotone, that is, if $f(u_1, \ldots, u_k) \leq g(u_1, \ldots, u_k)$
almost surely with respect to $\nu_k$, then $\mathbf{E}_k(f) \leq \mathbf{E}_k(g)$ almost surely with respect to
$\nu_{k-1}$;

(4.1.2) If $f(u_1, \ldots, u_k)$ is integrable and $g(u_1, \ldots, u_i)$, $i < k$, is a function, then

$$\mathbf{E}_k(gf) = g\,\mathbf{E}_k(f);$$

(4.1.3) If $f = a$ is a constant almost surely with respect to $\nu_k$, then $\mathbf{E}_k(f) = a$ almost
surely with respect to $\nu_{k-1}$.

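First, we prove the following technical lemma (a martingale inequality).

(4.2) Lemma. Let $f_k(u_1, \ldots, u_k)$, $k = 1, \ldots, n$, be integrable functions on
$\mathbb{R}^n \times \ldots \times \mathbb{R}^n$ ($k$ times), and let $\nu_n = \mu \times \ldots \times \mu$ ($n$ times). Suppose that for some numbers
$a$ and $b$ we have

$$a \leq \mathbf{E}_k(f_k) \quad \text{and} \quad \mathbf{E}_k(f_k^2) \leq b \quad \text{almost surely with respect to } \nu_{k-1}$$

for $k = 1, \ldots, n$.

Then for any $\delta > 0$ we have

$$\nu_n \Bigl\{ (u_1, \ldots, u_n) : \frac{1}{n} \sum_{k=1}^{n} f_k(u_1, \ldots, u_k) \leq a - \delta \Bigr\} \leq \frac{b}{\delta^2 n}.$$

Proof. Let $g_k = \mathbf{E}_k(f_k)$ and $h_k = f_k - g_k$. Since $g_k$ does not depend on $u_k$, using
(4.1.2) we have $\mathbf{E}_k(h_k^2) = \mathbf{E}_k(f_k^2) - 2\mathbf{E}_k(g_k f_k) + \mathbf{E}_k(g_k^2) = \mathbf{E}_k(f_k^2) - g_k^2$. Summarizing,
we may write

$$f_k = g_k + h_k, \quad \text{where } \mathbf{E}_k(h_k) = 0, \quad g_k \geq a, \quad \text{and} \quad \mathbf{E}_k(h_k^2) \leq b$$

almost surely with respect to $\nu_{k-1}$. Let

$$F = \frac{1}{n} \sum_{k=1}^{n} f_k \quad \text{and} \quad H = \frac{1}{n} \sum_{k=1}^{n} h_k.$$

Since $g_k \geq a$, we have $F \geq a + H$ almost surely. Now for $U = (u_1, \ldots, u_n)$ we have

$$\nu_n \bigl\{ U : F(U) \leq a - \delta \bigr\} \leq \nu_n \bigl\{ U : H(U) \leq -\delta \bigr\} \leq \frac{\mathbf{E}(H^2)}{\delta^2}
= \frac{1}{\delta^2 n^2} \Bigl( \sum_{k=1}^{n} \mathbf{E}(h_k^2) + 2 \sum_{1 \leq i < j \leq n} \mathbf{E}(h_i h_j) \Bigr)$$
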
(we used Chebyshev's inequality).
We note that it is legitimate to pass to global expectations $\mathbf{E}$ here. Indeed, since
$h_k^2$ is non-negative and $\mathbf{E}_k h_k^2 \leq \mathbf{E}_k f_k^2 \leq b$ it follows by (4.1.1), (4.1.3) and Tonelli's
Theorem that $h_k^2$ is $\nu_n$-integrable. Since $|h_i h_j| \leq (h_i^2 + h_j^2)/2$, the products $h_i h_j$ are
also $\nu_n$-integrable. Therefore, $H^2$ is $\nu_n$-integrable.

Since $h_k$ does not depend on $u_{k+1}, \ldots, u_n$, using (4.1.2) we have $\mathbf{E}(h_k^2) = \mathbf{E}_1 \ldots \mathbf{E}_n h_k^2
= \mathbf{E}_1 \ldots \mathbf{E}_k h_k^2$ and since $\mathbf{E}_k h_k^2 \leq b$ almost surely with respect to $\nu_{k-1}$, by (4.1.1) and (4.1.3) we
get that $\mathbf{E}(h_k^2) \leq b$ for each $k = 1, \ldots, n$. Furthermore, since $h_i$ does not depend on
$u_{i+1}, \ldots, u_n$, for $j > i$, using (4.1.2) and (4.1.3) we have

$$\mathbf{E}(h_i h_j) = \mathbf{E}_1 \ldots \mathbf{E}_n(h_i h_j) = \mathbf{E}_1 \ldots \mathbf{E}_j(h_i h_j) = \mathbf{E}_1 \ldots \mathbf{E}_{j-1}\bigl(h_i \mathbf{E}_j(h_j)\bigr) = 0.$$

The proof now follows. □

Now we are ready to prove the main results of the paper. 

(4.3) Proof of Theorem 2.2. The output $\alpha$ of Algorithm 2.1 is a function of
vectors $u_1, \ldots, u_n \in \mathbb{R}^n$, which are drawn at random from the standard Gaussian
distribution $\mu$ in $\mathbb{R}^n$. We consider the distribution of $\alpha(u_1, \ldots, u_n)$ with respect to
the product measure $\nu_n = \mu \times \ldots \times \mu$ ($n$ times).
Using Lemma 3.4, we may write

$$\alpha = \alpha(u_1, \ldots, u_n) = \left(\det[T_1 u_1, \ldots, T_n u_n]\right)^2 = D(T_1 u_1 \otimes T_1 u_1, \ldots, T_n u_n \otimes T_n u_n).$$

For $k = 0, \ldots, n$ let

$$\alpha_k(u_1, \ldots, u_k) = D(T_1 u_1 \otimes T_1 u_1, \ldots, T_k u_k \otimes T_k u_k, Q_{k+1}, \ldots, Q_n).$$

In particular, $\alpha_0 = D(Q_1, \ldots, Q_n)$ and $\alpha_n = \alpha$. Since the mixed discriminant of
positive semidefinite matrices is non-negative (Section 1.2), we deduce that $\alpha_k(u_1, \ldots, u_k)$
is non-negative. By (3.2.2) we have $\mathbf{E}_k(T_k u_k \otimes T_k u_k) = T_k T_k^* = Q_k$. Since $\alpha_k$ is a
polynomial in the coordinates of $u_1, \ldots, u_k$ and the mixed discriminant is linear in
every argument (see Section 1.2) we can interchange $D$ and $\mathbf{E}_k$:

$$\mathbf{E}_k(\alpha_k) = D\bigl(T_1 u_1 \otimes T_1 u_1, \ldots, T_{k-1} u_{k-1} \otimes T_{k-1} u_{k-1}, \mathbf{E}_k(T_k u_k \otimes T_k u_k), Q_{k+1}, \ldots, Q_n\bigr)$$
$$= D(T_1 u_1 \otimes T_1 u_1, \ldots, T_{k-1} u_{k-1} \otimes T_{k-1} u_{k-1}, Q_k, Q_{k+1}, \ldots, Q_n) = \alpha_{k-1}.$$

Since $\alpha$ is non-negative and $\nu_n$-measurable, applying theorems of Tonelli and Fubini,
we have

$$(4.3.1) \qquad \mathbf{E}(\alpha) = \mathbf{E}_1 \mathbf{E}_2 \ldots \mathbf{E}_n(\alpha_n) = \mathbf{E}_1 \ldots \mathbf{E}_k(\alpha_k) = \alpha_0 = D(Q_1, \ldots, Q_n),$$

and (2.2.1) is proven. Since $\alpha(u_1, \ldots, u_n)$ is non-negative, (2.2.2) follows by
Chebyshev's inequality.

If $D(Q_1, \ldots, Q_n) = 0$ then by (2.2.1) and non-negativity of $\alpha$ it follows that $\alpha$ is
identically 0 and (2.2.3) would follow. Therefore, without loss of generality, we may
assume that $D(Q_1, \ldots, Q_n) > 0$. Since $\alpha_k(u_1, \ldots, u_k)$ is a non-negative polynomial,






ALEXANDER BARVINOK 



by (4.3.1) we conclude that $\alpha_k(u_1, \ldots, u_k) > 0$ almost surely with respect to $\nu_k$. For
$k = 1, \ldots, n$ let

$$g_k(u_1, \ldots, u_k) = \frac{\alpha_k(u_1, \ldots, u_k)}{\alpha_{k-1}(u_1, \ldots, u_{k-1})} = \frac{\alpha_k}{\mathbf{E}_k(\alpha_k)}.$$

Hence we may write

$$\frac{\alpha(u_1, \ldots, u_n)}{D(Q_1, \ldots, Q_n)} = \prod_{k=1}^{n} g_k(u_1, \ldots, u_k)$$

almost surely with respect to $\nu_n$. Since the mixed discriminant is linear in each
argument, $g_k(u_1, \ldots, u_k)$ is a quadratic form in $u_k$ for any fixed $u_1, \ldots, u_{k-1}$ such
that $\alpha_{k-1}(u_1, \ldots, u_{k-1}) > 0$. Furthermore, since the mixed discriminant is
non-negative for positive semidefinite arguments, we conclude that $g_k(u_1, \ldots, u_k)$ is a
positive semidefinite quadratic form in $u_k$ for every such choice of $u_1, \ldots, u_{k-1}$. Since
$\mathbf{E}_k(g_k) = 1$ almost surely with respect to $\nu_{k-1}$, by Theorem 3.3 we conclude that

$$C_0 \leq \mathbf{E}_k(\ln g_k) \leq 0 \quad \text{with} \quad C_0 = \frac{4}{\sqrt{2\pi}} \int_0^{+\infty} (\ln t)\, e^{-t^2/2}\, dt$$

and that $|\mathbf{E}_k(\ln^2 g_k)| \leq 8$ almost surely with respect to $\nu_{k-1}$. In particular, since
$\ln^2 g_k$ is non-negative almost surely with respect to $\nu_k$, we deduce that $\ln^2 g_k$ is
$\nu_n$-integrable, and since $|\ln g_k| \leq 1 + \ln^2 g_k$ we deduce that $\ln g_k$ is $\nu_n$-integrable.
Now for $U = (u_1, \ldots, u_n)$ we have

$$\nu_n \Bigl\{ U : \frac{\alpha(U)}{D(Q_1, \ldots, Q_n)} \leq (\epsilon c_0)^n \Bigr\} = \nu_n \Bigl\{ U : \frac{1}{n} \ln \frac{\alpha(U)}{D(Q_1, \ldots, Q_n)} \leq C_0 + \ln \epsilon \Bigr\}$$
$$= \nu_n \Bigl\{ (u_1, \ldots, u_n) : \frac{1}{n} \sum_{k=1}^{n} \ln g_k(u_1, \ldots, u_k) \leq C_0 + \ln \epsilon \Bigr\}.$$

To complete the proof of (2.2.3), we use Lemma 4.2 with $f_k = \ln g_k$, $a = C_0$, $b = 8$
and $\delta = -\ln \epsilon$. □

(4.4) Proof of Theorem 2.4. For $i = 1, \ldots, n$ let us define the diagonal matrices
$Q_i = \operatorname{diag}\{a_{i1}, \ldots, a_{in}\}$. Algorithm 2.3 with the input $A$ is Algorithm 2.1 with the
input $Q_1, \ldots, Q_n$. By (1.2.1) $\operatorname{per} A = D(Q_1, \ldots, Q_n)$, and the proof follows by
Theorem 2.2. □



5. Remarks 

(5.1) The algorithms in the binary model. Suppose we want to operate in
the standard binary (Turing) model of computation. That is, we allow arithmetic
operations with integral (rational) numbers stored as bit strings (see [1]). In the
probabilistic setting, we suppose also that we can generate a random bit, that is, we
can sample a random variable $x \in \{0, 1\}$, which assumes either value with
probability 1/2. Algorithms 2.1 and 2.3 can be transformed into randomized polynomial
time algorithms in this model, which approximate the mixed discriminant, resp. the
permanent, within a $2^{O(n)}$ factor. This transformation is relatively straightforward for
Algorithm 2.3 and first we briefly sketch it here, omitting tedious details (although
it may not be the most efficient binary version).

We are given a non-negative integer matrix $A$ and we want to approximate $\operatorname{per} A$.
To compute the matrix $B$ from Algorithm 2.3 we need to compute square roots
$\sqrt{a_{ij}}$ and to sample variables $u_{ij}$ from the standard Gaussian distribution in $\mathbb{R}$. It is
known that the square root of a positive rational number can be computed within any
given error $\epsilon > 0$ in time that is polynomial in $\ln \epsilon^{-1}$. We observe that if we choose
$u_{ij}$, $1 \leq i, j \leq n$, independently from the standard Gaussian distribution in $\mathbb{R}$, then
we will have $|u_{ij}| \leq n$ for all $i, j$ with a probability that goes to 1 exponentially fast
as $n \to +\infty$. So, we compute $\sqrt{a_{ij}}$ for $a_{ij} > 1$ with such a precision $\epsilon$ that for any
choice of $u_{ij}$ with $|u_{ij}| \leq n$ the value of the output $\alpha = (\det B)^2$ gets computed with an
error at most $10^{-n}$ (we recall that to compute the determinant we need arithmetic
operations only). One can show that the bit size of $\epsilon$ can be bounded by a polynomial
in the input size. We note that if $A$ is a 0-1 matrix, we don't need square roots.

The next step is to approximate sampling from the standard Gaussian distribution
in $\mathbb{R}$. Let us choose a variation of the "polar method" (see Section 3.4.1 (C) of [17]). It
can be shown that if $x$ and $y$ are independent random variables, uniformly distributed
on $[0,1]$, then $u = \sin(2\pi y)\sqrt{-2 \ln x}$ has the standard Gaussian distribution in $\mathbb{R}$.
Let $(x_{ij}, y_{ij})$, $1 \leq i, j \leq n$, be the coordinates in the $2n^2$-dimensional unit cube
$[0,1]^{n^2} \times [0,1]^{n^2} = [0,1]^{2n^2}$ and $u_{ij}$ be the coordinates in $\mathbb{R}^{n^2}$. This allows us to
construct a map

$$\Phi : [0,1]^{n^2} \times [0,1]^{n^2} \to \mathbb{R}^{n^2}, \quad \text{where } u_{ij}(x, y) = \sin(2\pi y_{ij}) \sqrt{-2 \ln x_{ij}},$$

such that the push-forward measure of the standard Lebesgue measure $\lambda$ on $[0,1]^{2n^2}$
is the standard Gaussian measure $\mu$ in $\mathbb{R}^{n^2}$. The output $\alpha(u_{ij})$ is a function on $\mathbb{R}^{n^2}$.
The map $\Phi$ is singular on the part of the boundary where $x_{ij} = 0$ or $x_{ij} = 1$, so we
find a parallelepiped $\Pi \subset [0,1]^{2n^2}$ such that $\lambda(\Pi) > 0.99$ for all sufficiently large $n$
and $1 - 1/n^3 > x_{ij} > 1/n^3$ for any point in $\Pi$. The map $\Phi$ restricted to $\Pi$ is Lipschitz,
so we are able to find a rational $\delta > 0$ such that if $\|z_1 - z_2\| < \delta$ for $z_1, z_2 \in \Pi$ then
$|\alpha(\Phi(z_1)) - \alpha(\Phi(z_2))| < 10^{-n}$. It can be shown that the size of $\delta$ can be bounded
by a polynomial in the input size. Next, we have to approximate sampling from
the Lebesgue measure on $\Pi$ by binary sampling. We do it by sampling each
coordinate $x_{ij}$ and $y_{ij}$ independently. To sample a coordinate $x$ we sample $N$ random
bits $b_1, \ldots, b_N$ and let

$$x = \sum_{k=1}^{N} 2^{-k} b_k.$$

The $2^{2n^2 N}$ points that can be obtained in such a way form a lattice grid in $[0,1]^{2n^2}$.
We choose $N$ so large that this grid forms a $\delta$-net in $[0,1]^{2n^2}$. One can choose an $N$
that is bounded by a polynomial in $n$ and $\ln \delta$. Now we can approximate sampling
from the standard Gaussian measure in $\mathbb{R}^{n^2}$: we sample a point $z \in [0,1]^{2n^2}$ as above,
accept it if $z \in \Pi$ and compute $u = \Phi(z)$ approximately with a sufficiently high
precision, so that the corresponding value of $\alpha(u) = (\det B)^2$ gets computed with an
error at most $2 \cdot 10^{-n}$ (taking into account the error from approximate computation
of $\sqrt{a_{ij}}$). Again, this can be done in polynomial time.
Finally, the output $\alpha < 3 \cdot 10^{-n}$ is rounded to 0.
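
The sampling step just described, building a uniform coordinate from random bits and mapping it through the polar method, can be sketched as follows. This is an illustration only: it uses floating-point `sin` and `log` instead of controlled-precision rational arithmetic, and the bit count `nbits` and the rejection `margin` (standing in for the parallelepiped that avoids $x = 0$ and $x = 1$) are arbitrary assumptions.

```python
import math
import random

def uniform_from_bits(rng, nbits):
    """x = sum over k = 1..N of 2^(-k) * b_k, with N independent random bits b_k."""
    x = 0.0
    for k in range(1, nbits + 1):
        x += rng.getrandbits(1) * 2.0 ** (-k)
    return x

def gaussian_from_bits(rng, nbits=32, margin=1e-6):
    """Approximate standard Gaussian sample u = sin(2 pi y) * sqrt(-2 ln x)."""
    while True:
        x = uniform_from_bits(rng, nbits)
        y = uniform_from_bits(rng, nbits)
        # Reject points near the boundary x = 0 or x = 1, where the map is singular.
        if margin < x < 1.0 - margin:
            return math.sin(2.0 * math.pi * y) * math.sqrt(-2.0 * math.log(x))

rng = random.Random(0)
print([round(gaussian_from_bits(rng), 3) for _ in range(5)])
```

The rejection step changes the distribution only in the far tails, which mirrors the role of restricting the map to a region of measure close to 1.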

It is interesting to note that even to approximate the permanent of a 0-1 matrix
(a problem which sounds purely combinatorial) we seem to have to deal with
approximate computation of such "non-combinatorial" functions as sin and ln.

Algorithm 2.1 is modified similarly, except that on the first step we perturb the
matrices $Q_i \mapsto Q_i + \epsilon I$, where $I$ is the identity matrix and $\epsilon > 0$ is such that
the value of $D(Q_1, \ldots, Q_n)$ changes by not more than $10^{-n}$ (the bit size of $\epsilon$ can
be bounded by a polynomial in the size of the input). Now the $Q_i$ are strictly positive
definite and this makes computation of the decomposition $Q_i = T_i T_i^*$ stable, so that
we can compute $T_i$ with the desired precision.

(5.2) An application of the mixed discriminant to counting. Suppose we are
given a rectangular $n \times m$ matrix $A$ with the columns $u_1, \ldots, u_m$, which we interpret
as vectors from $\mathbb{R}^n$. Suppose that for any subset $I \subset \{1, \ldots, m\}$, $I = \{i_1, \ldots, i_n\}$,
the determinant of the submatrix $A_I = [u_{i_1}, \ldots, u_{i_n}]$ is either 0, $-1$ or 1. Such an $A$
represents a unimodular matroid on the set $\{1, \ldots, m\}$ and a subset $I$ with $\det A_I \neq 0$
is called a base of the matroid (see [22]).

Suppose now that the columns of $A$ are colored with $n$ different colors, which
induces a partition $\{1, \ldots, m\} = J_1 \cup \ldots \cup J_n$. We are interested in the number
of bases that have precisely one index of each color. Let us define the positive
semidefinite matrices $Q_1, \ldots, Q_n$ as follows:

$$Q_k = \sum_{i \in J_k} u_i \otimes u_i, \quad k = 1, \ldots, n.$$

Then the number of such bases can be expressed as $D(Q_1, \ldots, Q_n)$. Indeed, using
the linearity of the mixed discriminant (Section 1.2) and Lemma 3.4, we have

$$D(Q_1, \ldots, Q_n) = \sum_{I = \{i_1, \ldots, i_n\}} D(u_{i_1} \otimes u_{i_1}, \ldots, u_{i_n} \otimes u_{i_n})
= \sum_{I = \{i_1, \ldots, i_n\}} \left(\det[u_{i_1}, \ldots, u_{i_n}]\right)^2,$$

where the sums are taken over all $n$-subsets $I$ having precisely one element of each
color, and the proof follows.
(5.2.1) Example. Trees in a graph. Suppose we have a connected graph $G$ with
$n$ vertices and $m$ edges. Suppose further that the edges of $G$ are colored with $n - 1$
different colors. We are interested in spanning trees $T$ in $G$ such that all edges of $T$
have different colors. Let us number the vertices of $G$ by $1, \ldots, n$ and the edges of
$G$ by $1, \ldots, m$. Let us make $G$ an oriented graph by orienting its edges arbitrarily.
We consider the truncated incidence matrix (with the last row removed) $A = (a_{ij})$
for $1 \leq i \leq n - 1$ and $1 \leq j \leq m$ as an $(n - 1) \times m$ matrix such that

$$a_{ij} = \begin{cases} 1 & \text{if } i \text{ is the beginning of } j, \\ -1 & \text{if } i \text{ is the end of } j, \\ 0 & \text{otherwise.} \end{cases}$$

The spanning trees of $G$ are in one-to-one correspondence with non-degenerate
$(n - 1) \times (n - 1)$ submatrices of $A$ and the determinant of such a submatrix is either
1 or $-1$ (see, for example, Chapter 4 of [7]). Hence counting colored trees reduces
to computation of the mixed discriminant of some positive semidefinite matrices,
computed from the incidence matrix of the graph.
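
The reduction in Example 5.2.1 can be tried on a small graph. The sketch below is an illustration under assumptions: it computes the mixed discriminant exactly by a brute-force column expansion (exponential time, unlike the randomized algorithm of Section 2), numbers vertices from 0, and compares the result with a direct enumeration of edge choices; the triangle and its coloring are arbitrary.

```python
from itertools import permutations, product

def sign(p):
    """Sign of a permutation (given as a tuple), by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(m):
    """Exact determinant by the Leibniz formula (small n only)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

def mixed_discriminant(qs):
    """Coefficient of t_1...t_n in det(t_1 Q_1 + ... + t_n Q_n)."""
    n = len(qs)
    total = 0
    for sigma in permutations(range(n)):
        m = [[qs[sigma[j]][i][j] for j in range(n)] for i in range(n)]
        total += det(m)
    return total

def incidence_columns(num_vertices, edges):
    """Columns of the truncated incidence matrix (last vertex row removed)."""
    cols = []
    for tail, head in edges:
        col = [0] * num_vertices
        col[tail] = 1    # the edge begins at `tail`
        col[head] = -1   # the edge ends at `head`
        cols.append(col[: num_vertices - 1])
    return cols

def rainbow_trees_by_discriminant(num_vertices, edges, colors):
    """D(Q_1, ..., Q_{n-1}) with Q_k the sum of u (x) u over edges of color k."""
    r = num_vertices - 1
    qs = [[[0] * r for _ in range(r)] for _ in range(r)]
    for u, c in zip(incidence_columns(num_vertices, edges), colors):
        for i in range(r):
            for j in range(r):
                qs[c][i][j] += u[i] * u[j]
    return mixed_discriminant(qs)

def rainbow_trees_brute_force(num_vertices, edges, colors):
    """Pick one edge per color and count the choices with nonzero determinant."""
    r = num_vertices - 1
    cols = incidence_columns(num_vertices, edges)
    by_color = [[u for u, c in zip(cols, colors) if c == k] for k in range(r)]
    count = 0
    for choice in product(*by_color):
        m = [[choice[j][i] for j in range(r)] for i in range(r)]
        if det(m) != 0:
            count += 1
    return count

# Triangle on vertices 0, 1, 2; the first two edges share color 0, the third has color 1.
edges = [(0, 1), (1, 2), (0, 2)]
colors = [0, 0, 1]
print(rainbow_trees_by_discriminant(3, edges, colors),
      rainbow_trees_brute_force(3, edges, colors))
```

In the triangle, the two spanning trees that use the color-1 edge together with one color-0 edge are the only ones with all edge colors distinct.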